![]() |
The Twofish Encryption Algorithm: A 128-Bit Block Cipher
by Bruce Schneier ; John Kelsey ; Doug Whiting ; David Wagner ; Chris Hall ; Niels Ferguson Wiley Computer Publishing, John Wiley & Sons, Inc. ISBN: 0471353817 Pub Date: 03/01/99 |
Previous | Table of Contents | Next |
Our current results use n = 211 tries for all differentials with one active S-box, and n = 28 tries for all differentials with 2 active S-boxes. The structure size is 8 and 16 respectively. We use γ = 0.05, and σ = 12/256. The full Twofish cipher has 16 rounds. We assume that an adversary can somehow bypass the first round and can mount a 3R-attack. We thus look at the best 12-round differential characteristic.
With these parameters we found an upper bound on the probability of a 12-round differential characteristic of 2-102.8. This puts a differential attack against Twofish well outside the practical realm.
This upper bound is pessimistic in the following areas:
To create an attack, the attacker has to choose a specific differential characteristic. That characteristic uses certain specific differences of each of the S-boxes. To get anywhere near the bound, all of these differences need to have a probability close to our σ. We chose σ equal to the probability of the best differential of most S-boxes. However, a specific differential will not have the same probability under all keys. If the S-box keys are not known, the attacker has two options. First, he can guess the S-box key bits, and construct a differential characteristic based on that assumption. To achieve good differential probabilities in enough S-boxes, he will have to guess the keys of at least two S-boxes (between 32 and 64 bits, depending on the key size). Alternatively, he can try to find a differential that works for all keys. As we saw in section 9.4.1, this leads to very low differential probabilities.
We use σ = 12/256 while we know that there are keys for which the best S-box differential has probability 18/256 for a 128-bit key, and even higher for larger key sizes. However, those higher probabilities only occur for a small subset of the keys. We need to address the question of how much we are willing to pay in the size of the keyspace the attack is effective on to get a higher probability. If we have an attack that works for 2-28 the key space, how much more efficient should the attack be before it is a better choice for the attacker?
The most natural way is to look at the expected work for the attacker before recovering a single key. We assume that there are enough keys to attack, and optimize the attackers strategy to find any one key with the least amount of work. This is a reasonable way of looking at the attackers problem. After all, we know there is an attack that is effective on a subset of 2-64 of the keys with 264 work: a simple exhaustive search of any subset of that size will do. With such a brute-force attack on a subset of the keys, the expected amount of work before a key is found remains the same.
Let us now look at our Twofish differentials. Suppose we want to use a differential of S-box 0 with probability 14/256; this is possible for about one in eight of all possible keys. We have restricted ourselves to 1/8 of the set of keys, so the workload of our attack should be reduced by at least a factor of 8 for this to be worthwhile. We ran our search for the best differential characteristic pattern again where S-box 0 had a best differential probability of 14/256. The resulting differential probability was 4.6 times higher than the result with σ = 12/256. Is this worth it?
Let us assume a differential has a probability that depends on the key. We have a list of (pi, ki) where the probability of the differential is at most pi for a fraction ki of all keys. The expected workload of the attacker to get a single right pair is 1/pi for a fraction ki of the key space, and thus ς ki/pi when taken over all keys. The workload is at least max ki/pi. This corresponds to the workload of an attack with a differential with probability min pi/ki. In our situation, the minimum occurs when we use the S-box approximation with probability 12/256. As we currently ignore the 1/ki term, the actual effective probability of a real differential is lower than the bound that we have derived.
We conclude that using a higher probability than 12/256 for an S-box approximation is not worth the loss in key space on which the approximation holds. Thus the bounds we presented earlier hold, and are in fact pessimistic.
As an experiment, we ran the same analysis for Twofish with the 1-bit rotations removed. This makes our approximations match the behavior of the differential much better. Our results give an upper bound on the probability of a 12-round differential characteristic of 2-104.1
This bound is not much better than we have for full Twofish. However, the sequence of differential patterns that achieves this bound uses far more approximations of F that have three active S-boxes. In all cases it uses differential characteristics of G that have three active input bytes and only a few active output bytes. In practice, such differentials of G will have a far lower probability than the upper bound of 1 that we currently use. Therefore, we expect that our bound can be improved to beyond 2-128.
We have no reason to believe that the 1-bit rotations make Twofish stronger against a differential attack. They were conceived to break up the byte-level structure, but they do not require a separate approximation or increase the avalanche effect of the cipher. We think it is unlikely that the full Twofish has a differential characteristic that is significantly more likely than the version without rotations.
We will continue our analysis work to improve the bound and our understanding of the intricacies of Twofish. We have several areas that we plan to improve.
Improved G Differential Estimates. An obvious way to improve our overall bound is to improve our bounds on the differentials of G. We hope to be able to do this in the near future. A larger sample size will improve the accuracy of our estimates. Extending our computations to differentials of G with three active S-boxes should give a great improvement.
More Accurate Patterns. Our current pattern-representation is somewhat coarse. We group differentials only by which bytes contain active bits. Apart from the first and final round, all internal differential patterns in our best result have at most four active bytes (out of 16). A more fine-grained grouping of the differentials could lead to a better upper bound.
For example, there are only a few G differences with one active input byte that have a relatively high probability. Instead of grouping these into the sets, we could treat them separately. This would ensure that our algorithm doesnt magically transform the output of the high-probability difference pattern to one with fewer active bytes by the rotation. The characterization could be extended with special cases for differences that have only a few active nibbles. We expect that this will result in more S-boxes being needed for a full differential characteristic, and thus a lower bound.
Improved Treatment of S-box Differentials. There is still room to improve our approximations of the S-boxes. We can, for example, compute the best differential approximation for each of the output differences separately. This can then be combined with the analysis of G to get a better bound on differentials of F.
The data on the best S-box differentials in section 7.2.3 is merged for all the S-boxes. We plan to test the differentials again and collect information for each S-box separately.
We will also improve our handling of the variation of differential probability over the key-space. This will also result in a better bound.
Additive Differentials. We would like to take a closer look at additive differentials modulo 232. Although we do not expect these to be more useful, it would be nice to derive some bound in that case too.
Previous | Table of Contents | Next |